Dataset statistics
| Number of variables | 25 |
|---|---|
| Number of observations | 2823 |
| Missing cells | 5157 |
| Missing cells (%) | 7.3% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 551.5 KiB |
| Average record size in memory | 200.0 B |
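The derived figures in this table follow from the raw counts. As a minimal sketch using only the numbers reported above (the variable names are illustrative and not part of the report), the missing-cell percentage and average record size can be recomputed like this:

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 25
n_observations = 2823
missing_cells = 5157
total_size_kib = 551.5

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations

# Share of all cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

# Average record size: total bytes divided by the number of rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Both recomputed values round to the reported 7.3% and 200.0 B.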
Variable types
| CAT | 18 |
|---|---|
| NUM | 7 |
Warnings
| Warning | Type |
|---|---|
| ORDERDATE has a high cardinality: 252 distinct values | High cardinality |
| PRODUCTCODE has a high cardinality: 109 distinct values | High cardinality |
| CUSTOMERNAME has a high cardinality: 92 distinct values | High cardinality |
| PHONE has a high cardinality: 91 distinct values | High cardinality |
| ADDRESSLINE1 has a high cardinality: 92 distinct values | High cardinality |
| CITY has a high cardinality: 73 distinct values | High cardinality |
| POSTALCODE has a high cardinality: 73 distinct values | High cardinality |
| CONTACTLASTNAME has a high cardinality: 77 distinct values | High cardinality |
| CONTACTFIRSTNAME has a high cardinality: 72 distinct values | High cardinality |
| MONTH_ID is highly correlated with QTR_ID | High correlation |
| QTR_ID is highly correlated with MONTH_ID | High correlation |
| YEAR_ID is highly correlated with ORDERNUMBER | High correlation |
| ORDERNUMBER is highly correlated with YEAR_ID | High correlation |
| PHONE is highly correlated with CUSTOMERNAME and 9 other fields | High correlation |
| CUSTOMERNAME is highly correlated with PHONE and 9 other fields | High correlation |
| ADDRESSLINE1 is highly correlated with CUSTOMERNAME and 9 other fields | High correlation |
| ADDRESSLINE2 is highly correlated with CUSTOMERNAME and 9 other fields | High correlation |
| CITY is highly correlated with CUSTOMERNAME and 8 other fields | High correlation |
| STATE is highly correlated with CUSTOMERNAME and 7 other fields | High correlation |
| POSTALCODE is highly correlated with CUSTOMERNAME and 9 other fields | High correlation |
| COUNTRY is highly correlated with CUSTOMERNAME and 9 other fields | High correlation |
| TERRITORY is highly correlated with CUSTOMERNAME and 9 other fields | High correlation |
| CONTACTLASTNAME is highly correlated with CUSTOMERNAME and 8 other fields | High correlation |
| CONTACTFIRSTNAME is highly correlated with CUSTOMERNAME and 7 other fields | High correlation |
| ADDRESSLINE2 has 2521 (89.3%) missing values | Missing |
| STATE has 1486 (52.6%) missing values | Missing |
| POSTALCODE has 76 (2.7%) missing values | Missing |
| TERRITORY has 1074 (38.0%) missing values | Missing |
| PRODUCTCODE is uniformly distributed | Uniform |
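The "Missing" entries above can be cross-checked against the dataset-wide total. The following is a small sketch that uses only counts quoted in this report (the dictionary and variable names are illustrative): it recomputes each column's missing percentage and confirms that the four columns account for all 5157 missing cells.

```python
n_observations = 2823

# Missing-value counts per column, copied from the "Missing" entries above.
missing_by_column = {
    "ADDRESSLINE2": 2521,
    "STATE": 1486,
    "POSTALCODE": 76,
    "TERRITORY": 1074,
}

# Percentage of rows with a missing value in each column.
missing_pct = {
    column: round(100 * count / n_observations, 1)
    for column, count in missing_by_column.items()
}

# The per-column counts should add up to the reported "Missing cells" figure.
total_missing = sum(missing_by_column.values())

for column, pct in missing_pct.items():
    print(f"{column}: {pct}%")
print(f"total missing cells: {total_missing}")
```

Because the sum equals the reported total, no other column contains missing values.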
Reproduction
| Analysis started | 2020-12-12 09:54:39.516945 |
|---|---|
| Analysis finished | 2020-12-12 09:56:35.664690 |
| Duration | 1 minute and 56.15 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
ORDERNUMBER
Real number (ℝ≥0)
| Distinct | 307 |
|---|---|
| Distinct (%) | 10.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10258.72512 |
|---|---|
| Minimum | 10100 |
| Maximum | 10425 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 22.1 KiB |
Quantile statistics
| Minimum | 10100 |
|---|---|
| 5-th percentile | 10115 |
| Q1 | 10180 |
| median | 10262 |
| Q3 | 10333.5 |
| 95-th percentile | 10405 |
| Maximum | 10425 |
| Range | 325 |
| Interquartile range (IQR) | 153.5 |
Descriptive statistics
| Standard deviation | 92.0854776 |
|---|---|
| Coefficient of variation (CV) | 0.008976308124 |
| Kurtosis | -1.173309247 |
| Mean | 10258.72512 |
| Median Absolute Deviation (MAD) | 79 |
| Skewness | 0.01382298874 |
| Sum | 28960381 |
| Variance | 8479.735184 |
| Monotonicity | Not monotonic |
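Several entries in the quantile and descriptive statistics tables directly above are simple functions of the others. As a hedged sketch (the variable names are illustrative; the numbers are copied from those tables), the relationships can be verified as follows:

```python
# Values copied from the quantile and descriptive statistics tables above.
minimum, maximum = 10100, 10425
q1, q3 = 10180, 10333.5
mean = 10258.72512
std_dev = 92.0854776
total = 28960381
n_observations = 2823

value_range = maximum - minimum          # Range = Maximum - Minimum
iqr = q3 - q1                            # Interquartile range = Q3 - Q1
cv = std_dev / mean                      # Coefficient of variation = std / mean
variance = std_dev ** 2                  # Variance = squared standard deviation
mean_from_sum = total / n_observations   # Mean = Sum / number of observations

print(value_range, iqr)
print(f"{cv:.9f} {variance:.3f} {mean_from_sum:.5f}")
```

The same identities hold for every numeric variable in the report, so they are a quick consistency check when reading the other sections.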
| Value | Count | Frequency (%) | |
| 10332 | 18 | 0.6% | |
| 10386 | 18 | 0.6% | |
| 10165 | 18 | 0.6% | |
| 10159 | 18 | 0.6% | |
| 10168 | 18 | 0.6% | |
| 10275 | 18 | 0.6% | |
| 10222 | 18 | 0.6% | |
| 10398 | 18 | 0.6% | |
| 10106 | 18 | 0.6% | |
| 10316 | 18 | 0.6% | |
| Other values (297) | 2643 | 93.6% |
| Value | Count | Frequency (%) | |
| 10100 | 4 | 0.1% | |
| 10101 | 4 | 0.1% | |
| 10102 | 2 | 0.1% | |
| 10103 | 16 | 0.6% | |
| 10104 | 13 | 0.5% |
| Value | Count | Frequency (%) | |
| 10425 | 13 | 0.5% | |
| 10424 | 6 | 0.2% | |
| 10423 | 5 | 0.2% | |
| 10422 | 2 | 0.1% | |
| 10421 | 2 | 0.1% |
QUANTITYORDERED
Real number (ℝ≥0)
| Distinct | 58 |
|---|---|
| Distinct (%) | 2.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 35.09280907 |
|---|---|
| Minimum | 6 |
| Maximum | 97 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 22.1 KiB |
Quantile statistics
| Minimum | 6 |
|---|---|
| 5-th percentile | 21 |
| Q1 | 27 |
| median | 35 |
| Q3 | 43 |
| 95-th percentile | 49 |
| Maximum | 97 |
| Range | 91 |
| Interquartile range (IQR) | 16 |
Descriptive statistics
| Standard deviation | 9.741442737 |
|---|---|
| Coefficient of variation (CV) | 0.2775908511 |
| Kurtosis | 0.4157437898 |
| Mean | 35.09280907 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 0.3625853288 |
| Sum | 99067 |
| Variance | 94.8957066 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 34 | 112 | 4.0% | |
| 21 | 103 | 3.6% | |
| 46 | 101 | 3.6% | |
| 27 | 100 | 3.5% | |
| 45 | 97 | 3.4% | |
| 41 | 97 | 3.4% | |
| 31 | 97 | 3.4% | |
| 26 | 96 | 3.4% | |
| 48 | 94 | 3.3% | |
| 25 | 94 | 3.3% | |
| Other values (48) | 1832 | 64.9% |
| Value | Count | Frequency (%) | |
| 6 | 2 | 0.1% | |
| 10 | 2 | 0.1% | |
| 11 | 2 | 0.1% | |
| 12 | 1 | < 0.1% | |
| 13 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 97 | 1 | < 0.1% | |
| 85 | 1 | < 0.1% | |
| 77 | 1 | < 0.1% | |
| 76 | 3 | 0.1% | |
| 70 | 2 | 0.1% |
PRICEEACH
Real number (ℝ≥0)
| Distinct | 1016 |
|---|---|
| Distinct (%) | 36.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 83.6585441 |
|---|---|
| Minimum | 26.88 |
| Maximum | 100 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 22.1 KiB |
Quantile statistics
| Minimum | 26.88 |
|---|---|
| 5-th percentile | 42.67 |
| Q1 | 68.86 |
| median | 95.7 |
| Q3 | 100 |
| 95-th percentile | 100 |
| Maximum | 100 |
| Range | 73.12 |
| Interquartile range (IQR) | 31.14 |
Descriptive statistics
| Standard deviation | 20.17427653 |
|---|---|
| Coefficient of variation (CV) | 0.2411502225 |
| Kurtosis | -0.374817693 |
| Mean | 83.6585441 |
| Median Absolute Deviation (MAD) | 4.3 |
| Skewness | -0.946648859 |
| Sum | 236168.07 |
| Variance | 407.0014334 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 100 | 1304 | 46.2% | |
| 96.34 | 6 | 0.2% | |
| 59.87 | 6 | 0.2% | |
| 67.14 | 5 | 0.2% | |
| 51.93 | 5 | 0.2% | |
| 80.55 | 5 | 0.2% | |
| 57.73 | 5 | 0.2% | |
| 61.99 | 5 | 0.2% | |
| 90.17 | 5 | 0.2% | |
| 89.38 | 5 | 0.2% | |
| Other values (1006) | 1472 | 52.1% |
| Value | Count | Frequency (%) | |
| 26.88 | 1 | < 0.1% | |
| 27.22 | 1 | < 0.1% | |
| 28.29 | 1 | < 0.1% | |
| 28.88 | 1 | < 0.1% | |
| 29.21 | 2 | 0.1% |
| Value | Count | Frequency (%) | |
| 100 | 1304 | 46.2% | |
| 99.91 | 1 | < 0.1% | |
| 99.82 | 2 | 0.1% | |
| 99.72 | 1 | < 0.1% | |
| 99.69 | 1 | < 0.1% |
ORDERLINENUMBER
Real number (ℝ≥0)
| Distinct | 18 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.46617074 |
|---|---|
| Minimum | 1 |
| Maximum | 18 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 22.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 6 |
| Q3 | 9 |
| 95-th percentile | 14 |
| Maximum | 18 |
| Range | 17 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 4.225840965 |
|---|---|
| Coefficient of variation (CV) | 0.6535306806 |
| Kurtosis | -0.5611542428 |
| Mean | 6.46617074 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 0.5907412107 |
| Sum | 18254 |
| Variance | 17.85773186 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 1 | 307 | 10.9% | |
| 2 | 291 | 10.3% | |
| 3 | 270 | 9.6% | |
| 4 | 256 | 9.1% | |
| 5 | 239 | 8.5% | |
| 6 | 221 | 7.8% | |
| 7 | 197 | 7.0% | |
| 8 | 187 | 6.6% | |
| 9 | 165 | 5.8% | |
| 10 | 141 | 5.0% | |
| Other values (8) | 549 | 19.4% |
| Value | Count | Frequency (%) | |
| 1 | 307 | 10.9% | |
| 2 | 291 | 10.3% | |
| 3 | 270 | 9.6% | |
| 4 | 256 | 9.1% | |
| 5 | 239 | 8.5% |
| Value | Count | Frequency (%) | |
| 18 | 10 | 0.4% | |
| 17 | 25 | 0.9% | |
| 16 | 42 | 1.5% | |
| 15 | 56 | 2.0% | |
| 14 | 81 | 2.9% |
SALES
Real number (ℝ≥0)
| Distinct | 2763 |
|---|---|
| Distinct (%) | 97.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3553.889072 |
|---|---|
| Minimum | 482.13 |
| Maximum | 14082.8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 22.1 KiB |
Quantile statistics
| Minimum | 482.13 |
|---|---|
| 5-th percentile | 1268.757 |
| Q1 | 2203.43 |
| median | 3184.8 |
| Q3 | 4508 |
| 95-th percentile | 7108.12 |
| Maximum | 14082.8 |
| Range | 13600.67 |
| Interquartile range (IQR) | 2304.57 |
Descriptive statistics
| Standard deviation | 1841.865106 |
|---|---|
| Coefficient of variation (CV) | 0.5182674722 |
| Kurtosis | 1.792676469 |
| Mean | 3553.889072 |
| Median Absolute Deviation (MAD) | 1102.31 |
| Skewness | 1.161076001 |
| Sum | 10032628.85 |
| Variance | 3392467.068 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 3003 | 3 | 0.1% | |
| 1666.7 | 2 | 0.1% | |
| 5984.14 | 2 | 0.1% | |
| 1030.44 | 2 | 0.1% | |
| 2935.15 | 2 | 0.1% | |
| 2795.27 | 2 | 0.1% | |
| 1463 | 2 | 0.1% | |
| 1742.4 | 2 | 0.1% | |
| 2441.04 | 2 | 0.1% | |
| 2620.8 | 2 | 0.1% | |
| Other values (2753) | 2802 | 99.3% |
| Value | Count | Frequency (%) | |
| 482.13 | 1 | < 0.1% | |
| 541.14 | 1 | < 0.1% | |
| 553.95 | 1 | < 0.1% | |
| 577.6 | 1 | < 0.1% | |
| 640.05 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 14082.8 | 1 | < 0.1% | |
| 12536.5 | 1 | < 0.1% | |
| 12001 | 1 | < 0.1% | |
| 11887.8 | 1 | < 0.1% | |
| 11886.6 | 1 | < 0.1% |
ORDERDATE
Categorical
| Distinct | 252 |
|---|---|
| Distinct (%) | 8.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 22.1 KiB |
| Value | Count | Frequency (%) | |
| 11/14/2003 0:00 | 38 | 1.3% | |
| 11/24/2004 0:00 | 35 | 1.2% | |
| 11/12/2003 0:00 | 34 | 1.2% | |
| 11/17/2004 0:00 | 32 | 1.1% | |
| 11/4/2004 0:00 | 29 | 1.0% | |
| 10/16/2004 0:00 | 28 | 1.0% | |
| 12/2/2003 0:00 | 28 | 1.0% | |
| 11/5/2003 0:00 | 28 | 1.0% | |
| 11/6/2003 0:00 | 27 | 1.0% | |
| 8/20/2004 0:00 | 27 | 1.0% | |
| Other values (242) | 2517 | 89.2% |
Unique
| Unique | 9 |
|---|---|
| Unique (%) | 0.3% |
Length
| Max length | 15 |
|---|---|
| Median length | 14 |
| Mean length | 14.04463337 |
| Min length | 13 |
STATUS
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 22.1 KiB |
| Value | Count | Frequency (%) | |
| Shipped | 2617 | 92.7% | |
| Cancelled | 60 | 2.1% | |
| Resolved | 47 | 1.7% | |
| On Hold | 44 | 1.6% | |
| In Process | 41 | 1.5% | |
| Disputed | 14 | 0.5% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 10 |
|---|---|
| Median length | 7 |
| Mean length | 7.107686858 |
| Min length | 7 |
QTR_ID
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 22.1 KiB |
| Value | Count | Frequency (%) | |
| 4 | 1094 | 38.8% | |
| 1 | 665 | 23.6% | |
| 2 | 561 | 19.9% | |
| 3 | 503 | 17.8% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
MONTH_ID
Real number (ℝ≥0)
| Distinct | 12 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.092454835 |
|---|---|
| Minimum | 1 |
| Maximum | 12 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 22.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 4 |
| median | 8 |
| Q3 | 11 |
| 95-th percentile | 12 |
| Maximum | 12 |
| Range | 11 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 3.656633308 |
|---|---|
| Coefficient of variation (CV) | 0.515566668 |
| Kurtosis | -1.38327478 |
| Mean | 7.092454835 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | -0.2729015635 |
| Sum | 20022 |
| Variance | 13.37096715 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 11 | 597 | 21.1% | |
| 10 | 317 | 11.2% | |
| 5 | 252 | 8.9% | |
| 1 | 229 | 8.1% | |
| 2 | 224 | 7.9% | |
| 3 | 212 | 7.5% | |
| 8 | 191 | 6.8% | |
| 12 | 180 | 6.4% | |
| 4 | 178 | 6.3% | |
| 9 | 171 | 6.1% | |
| Other values (2) | 272 | 9.6% |
| Value | Count | Frequency (%) | |
| 1 | 229 | 8.1% | |
| 2 | 224 | 7.9% | |
| 3 | 212 | 7.5% | |
| 4 | 178 | 6.3% | |
| 5 | 252 | 8.9% |
| Value | Count | Frequency (%) | |
| 12 | 180 | 6.4% | |
| 11 | 597 | 21.1% | |
| 10 | 317 | 11.2% | |
| 9 | 171 | 6.1% | |
| 8 | 191 | 6.8% |
YEAR_ID
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 22.1 KiB |
| Value | Count | Frequency (%) | |
| 2004 | 1345 | 47.6% | |
| 2003 | 1000 | 35.4% | |
| 2005 | 478 | 16.9% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
PRODUCTLINE
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 22.1 KiB |
| Value | Count | Frequency (%) | |
| Classic Cars | 967 | 34.3% | |
| Vintage Cars | 607 | 21.5% | |
| Motorcycles | 331 | 11.7% | |
| Planes | 306 | 10.8% | |
| Trucks and Buses | 301 | 10.7% | |
| Ships | 234 | 8.3% | |
| Trains | 77 | 2.7% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 16 |
|---|---|
| Median length | 12 |
| Mean length | 10.91498406 |
| Min length | 5 |
MSRP
Real number (ℝ≥0)
| Distinct | 80 |
|---|---|
| Distinct (%) | 2.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 100.7155508 |
|---|---|
| Minimum | 33 |
| Maximum | 214 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 22.1 KiB |
Quantile statistics
| Minimum | 33 |
|---|---|
| 5-th percentile | 43 |
| Q1 | 68 |
| median | 99 |
| Q3 | 124 |
| 95-th percentile | 170 |
| Maximum | 214 |
| Range | 181 |
| Interquartile range (IQR) | 56 |
Descriptive statistics
| Standard deviation | 40.18791168 |
|---|---|
| Coefficient of variation (CV) | 0.3990238979 |
| Kurtosis | -0.1318145207 |
| Mean | 100.7155508 |
| Median Absolute Deviation (MAD) | 28 |
| Skewness | 0.5801750539 |
| Sum | 284320 |
| Variance | 1615.068245 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 118 | 104 | 3.7% | |
| 99 | 103 | 3.6% | |
| 136 | 80 | 2.8% | |
| 62 | 78 | 2.8% | |
| 68 | 77 | 2.7% | |
| 60 | 76 | 2.7% | |
| 80 | 73 | 2.6% | |
| 101 | 54 | 1.9% | |
| 115 | 54 | 1.9% | |
| 54 | 54 | 1.9% | |
| Other values (70) | 2070 | 73.3% |
| Value | Count | Frequency (%) | |
| 33 | 25 | 0.9% | |
| 35 | 28 | 1.0% | |
| 37 | 27 | 1.0% | |
| 40 | 25 | 0.9% | |
| 41 | 22 | 0.8% |
| Value | Count | Frequency (%) | |
| 214 | 28 | 1.0% | |
| 207 | 26 | 0.9% | |
| 194 | 25 | 0.9% | |
| 193 | 26 | 0.9% | |
| 173 | 26 | 0.9% |
PRODUCTCODE
Categorical
| Distinct | 109 |
|---|---|
| Distinct (%) | 3.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 22.1 KiB |
| Value | Count | Frequency (%) | |
| S18_3232 | 52 | 1.8% | |
| S32_2509 | 28 | 1.0% | |
| S10_4962 | 28 | 1.0% | |
| S18_1097 | 28 | 1.0% | |
| S50_1392 | 28 | 1.0% | |
| S18_2432 | 28 | 1.0% | |
| S12_1666 | 28 | 1.0% | |
| S24_2840 | 28 | 1.0% | |
| S24_1444 | 28 | 1.0% | |
| S10_1949 | 28 | 1.0% | |
| Other values (99) | 2519 | 89.2% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 9 |
|---|---|
| Median length | 8 |
| Mean length | 8.110874956 |
| Min length | 8 |
CUSTOMERNAME
Categorical
| Distinct | 92 |
|---|---|
| Distinct (%) | 3.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 22.1 KiB |
| Value | Count | Frequency (%) | |
| Euro Shopping Channel | 259 | 9.2% | |
| Mini Gifts Distributors Ltd. | 180 | 6.4% | |
| Australian Collectors, Co. | 55 | 1.9% | |
| La Rochelle Gifts | 53 | 1.9% | |
| AV Stores, Co. | 51 | 1.8% | |
| Land of Toys Inc. | 49 | 1.7% | |
| Rovelli Gifts | 48 | 1.7% | |
| Muscle Machine Inc | 48 | 1.7% | |
| Anna's Decorations, Ltd | 46 | 1.6% | |
| Souveniers And Things Co. | 46 | 1.6% | |
| Other values (82) | 1988 | 70.4% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 34 |
|---|---|
| Median length | 21 |
| Mean length | 20.97272405 |
| Min length | 10 |
PHONE
Categorical
| Distinct | 91 |
|---|---|
| Distinct (%) | 3.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 22.1 KiB |
| Value | Count | Frequency (%) | |
| (91) 555 94 44 | 259 | 9.2% | |
| 4155551450 | 180 | 6.4% | |
| 03 9520 4555 | 55 | 1.9% | |
| 40.67.8555 | 53 | 1.9% | |
| (171) 555-1555 | 51 | 1.8% | |
| 6175558555 | 51 | 1.8% | |
| 2125557818 | 49 | 1.7% | |
| 035-640555 | 48 | 1.7% | |
| 2125557413 | 48 | 1.7% | |
| +61 2 9495 8555 | 46 | 1.6% | |
| Other values (81) | 1983 | 70.2% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 17 |
|---|---|
| Median length | 10 |
| Mean length | 11.63655685 |
| Min length | 9 |
ADDRESSLINE1
Categorical
| Distinct | 92 |
|---|---|
| Distinct (%) | 3.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 22.1 KiB |
| Value | Count | Frequency (%) | |
| C/ Moralzarzal, 86 | 259 | 9.2% | |
| 5677 Strong St. | 180 | 6.4% | |
| 636 St Kilda Road | 55 | 1.9% | |
| 67, rue des Cinquante Otages | 53 | 1.9% | |
| Fauntleroy Circus | 51 | 1.8% | |
| 897 Long Airport Avenue | 49 | 1.7% | |
| 4092 Furth Circle | 48 | 1.7% | |
| Via Ludovico il Moro 22 | 48 | 1.7% | |
| 201 Miller Street | 46 | 1.6% | |
| Monitor Money Building, 815 Pacific Hwy | 46 | 1.6% | |
| Other values (82) | 1988 | 70.4% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 42 |
|---|---|
| Median length | 18 |
| Mean length | 19.44597945 |
| Min length | 11 |
ADDRESSLINE2
Categorical
| Distinct | 9 |
|---|---|
| Distinct (%) | 3.0% |
| Missing | 2521 |
| Missing (%) | 89.3% |
| Memory size | 22.1 KiB |
| Value | Count | Frequency (%) | |
| Level 3 | 55 | 1.9% | |
| Suite 400 | 48 | 1.7% | |
| Level 15 | 46 | 1.6% | |
| Level 6 | 46 | 1.6% | |
| 2nd Floor | 36 | 1.3% | |
| Suite 101 | 25 | 0.9% | |
| Suite 750 | 20 | 0.7% | |
| Floor No. 4 | 16 | 0.6% | |
| Suite 200 | 10 | 0.4% | |
| (Missing) | 2521 | 89.3% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 11 |
|---|---|
| Median length | 3 |
| Mean length | 3.565356004 |
| Min length | 3 |
CITY
Categorical
| Distinct | 73 |
|---|---|
| Distinct (%) | 2.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 22.1 KiB |
| Value | Count | Frequency (%) | |
| Madrid | 304 | 10.8% | |
| San Rafael | 180 | 6.4% | |
| NYC | 152 | 5.4% | |
| Singapore | 79 | 2.8% | |
| Paris | 70 | 2.5% | |
| San Francisco | 62 | 2.2% | |
| New Bedford | 61 | 2.2% | |
| Nantes | 60 | 2.1% | |
| Melbourne | 55 | 1.9% | |
| Manchester | 51 | 1.8% | |
| Other values (63) | 1749 | 62.0% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 14 |
|---|---|
| Median length | 8 |
| Mean length | 7.753099539 |
| Min length | 3 |
STATE
Categorical
| Distinct | 16 |
|---|---|
| Distinct (%) | 1.2% |
| Missing | 1486 |
| Missing (%) | 52.6% |
| Memory size | 22.1 KiB |
| Value | Count | Frequency (%) | |
| CA | 416 | 14.7% | |
| MA | 190 | 6.7% | |
| NY | 178 | 6.3% | |
| NSW | 92 | 3.3% | |
| Victoria | 78 | 2.8% | |
| PA | 75 | 2.7% | |
| CT | 61 | 2.2% | |
| BC | 48 | 1.7% | |
| NH | 34 | 1.2% | |
| Tokyo | 32 | 1.1% | |
| Other values (6) | 133 | 4.7% | |
| (Missing) | 1486 | 52.6% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 13 |
|---|---|
| Median length | 3 |
| Mean length | 2.955012398 |
| Min length | 2 |
POSTALCODE
Categorical
| Distinct | 73 |
|---|---|
| Distinct (%) | 2.7% |
| Missing | 76 |
| Missing (%) | 2.7% |
| Memory size | 22.1 KiB |
| Value | Count | Frequency (%) | |
| 28034 | 259 | 9.2% | |
| 97562 | 205 | 7.3% | |
| 10022 | 152 | 5.4% | |
| 94217 | 89 | 3.2% | |
| 50553 | 61 | 2.2% | |
| 44000 | 60 | 2.1% | |
| 3004 | 55 | 1.9% | |
| EC2 5NT | 51 | 1.8% | |
| 24100 | 48 | 1.7% | |
| 58339 | 47 | 1.7% | |
| Other values (63) | 1720 | 60.9% | |
| (Missing) | 76 | 2.7% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 9 |
|---|---|
| Median length | 5 |
| Mean length | 5.153737159 |
| Min length | 1 |
COUNTRY
Categorical
| Distinct | 19 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 22.1 KiB |
| Value | Count | Frequency (%) | |
| USA | 1004 | 35.6% | |
| Spain | 342 | 12.1% | |
| France | 314 | 11.1% | |
| Australia | 185 | 6.6% | |
| UK | 144 | 5.1% | |
| Italy | 113 | 4.0% | |
| Finland | 92 | 3.3% | |
| Norway | 85 | 3.0% | |
| Singapore | 79 | 2.8% | |
| Canada | 70 | 2.5% | |
| Other values (9) | 395 | 14.0% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 11 |
|---|---|
| Median length | 5 |
| Mean length | 5.044633369 |
| Min length | 2 |
TERRITORY
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 1074 |
| Missing (%) | 38.0% |
| Memory size | 22.1 KiB |
| Value | Count | Frequency (%) | |
| EMEA | 1407 | 49.8% | |
| APAC | 221 | 7.8% | |
| Japan | 121 | 4.3% | |
| (Missing) | 1074 | 38.0% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 5 |
|---|---|
| Median length | 4 |
| Mean length | 3.66241587 |
| Min length | 3 |
CONTACTLASTNAME
Categorical
| Distinct | 77 |
|---|---|
| Distinct (%) | 2.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 22.1 KiB |
| Value | Count | Frequency (%) | |
| Freyre | 259 | 9.2% | |
| Nelson | 204 | 7.2% | |
| Young | 115 | 4.1% | |
| Frick | 91 | 3.2% | |
| Brown | 88 | 3.1% | |
| Yu | 80 | 2.8% | |
| Hernandez | 70 | 2.5% | |
| Ferguson | 55 | 1.9% | |
| King | 54 | 1.9% | |
| Labrune | 53 | 1.9% | |
| Other values (67) | 1754 | 62.1% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 11 |
|---|---|
| Median length | 6 |
| Mean length | 6.441374424 |
| Min length | 2 |
CONTACTFIRSTNAME
Categorical
| Distinct | 72 |
|---|---|
| Distinct (%) | 2.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 22.1 KiB |
| Value | Count | Frequency (%) | |
| Diego | 259 | 9.2% | |
| Valarie | 257 | 9.1% | |
| Julie | 117 | 4.1% | |
| Michael | 84 | 3.0% | |
| Sue | 84 | 3.0% | |
| Juri | 60 | 2.1% | |
| Maria | 58 | 2.1% | |
| Elizabeth | 55 | 1.9% | |
| Peter | 55 | 1.9% | |
| Janine | 53 | 1.9% | |
| Other values (62) | 1741 | 61.7% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 10 |
|---|---|
| Median length | 5 |
| Mean length | 5.668083599 |
| Min length | 3 |
DEALSIZE
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 22.1 KiB |
| Value | Count | Frequency (%) | |
| Medium | 1384 | 49.0% | |
| Small | 1282 | 45.4% | |
| Large | 157 | 5.6% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 6 |
|---|---|
| Median length | 5 |
| Mean length | 5.49025859 |
| Min length | 5 |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line does not affect r; only the sign of the slope does. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
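The calculation described above can be sketched in a few lines of plain Python. This is an illustrative implementation, not the code the report generator uses; the function name `pearson_r` and the sample data are assumptions made for the example.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so unnormalized sums of products and squares are sufficient.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_line = [2 * v + 10 for v in x]    # increasing line
y_steep = [50 * v - 3 for v in x]   # steeper line with a different location and scale
y_down = [-v for v in x]            # decreasing line

print(pearson_r(x, y_line), pearson_r(x, y_steep), pearson_r(x, y_down))
```

The two increasing lines give the same coefficient despite their different slopes and offsets, which illustrates the invariance under changes in location and scale; the decreasing line flips the sign.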
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
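As a minimal sketch of that definition (the helper names `average_ranks`, `pearson_r`, and `spearman_rho` are illustrative, and ties are handled by average ranks, which is one common convention), ρ is Pearson's r applied to the ranks:

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based position of the run
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but clearly nonlinear

print(spearman_rho(x, y), pearson_r(x, y))
```

For this cubic relationship ρ reaches its maximum because the ordering is perfectly preserved, while Pearson's r stays below 1 because the relationship is not linear.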
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
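The pair-counting definition above translates directly into code. The sketch below implements exactly that formula (often called tau-a); it is an assumption-labeled illustration, and statistical libraries commonly report a tie-adjusted variant instead, so results can differ when the data contain ties. The function name and sample data are hypothetical.

```python
from itertools import combinations

def kendall_tau(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction (concordant).
        # Negative product: they move in opposite directions (discordant).
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 5]

print(kendall_tau(x, y))
```

Of the 10 pairs in this example, 7 are concordant and 3 are discordant, giving τ = (7 − 3) / 10.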
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | ORDERNUMBER | QUANTITYORDERED | PRICEEACH | ORDERLINENUMBER | SALES | ORDERDATE | STATUS | QTR_ID | MONTH_ID | YEAR_ID | PRODUCTLINE | MSRP | PRODUCTCODE | CUSTOMERNAME | PHONE | ADDRESSLINE1 | ADDRESSLINE2 | CITY | STATE | POSTALCODE | COUNTRY | TERRITORY | CONTACTLASTNAME | CONTACTFIRSTNAME | DEALSIZE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 10107 | 30 | 95.70 | 2 | 2871.00 | 2/24/2003 0:00 | Shipped | 1 | 2 | 2003 | Motorcycles | 95 | S10_1678 | Land of Toys Inc. | 2125557818 | 897 Long Airport Avenue | NaN | NYC | NY | 10022 | USA | NaN | Yu | Kwai | Small |
| 1 | 10121 | 34 | 81.35 | 5 | 2765.90 | 5/7/2003 0:00 | Shipped | 2 | 5 | 2003 | Motorcycles | 95 | S10_1678 | Reims Collectables | 26.47.1555 | 59 rue de l'Abbaye | NaN | Reims | NaN | 51100 | France | EMEA | Henriot | Paul | Small |
| 2 | 10134 | 41 | 94.74 | 2 | 3884.34 | 7/1/2003 0:00 | Shipped | 3 | 7 | 2003 | Motorcycles | 95 | S10_1678 | Lyon Souveniers | +33 1 46 62 7555 | 27 rue du Colonel Pierre Avia | NaN | Paris | NaN | 75508 | France | EMEA | Da Cunha | Daniel | Medium |
| 3 | 10145 | 45 | 83.26 | 6 | 3746.70 | 8/25/2003 0:00 | Shipped | 3 | 8 | 2003 | Motorcycles | 95 | S10_1678 | Toys4GrownUps.com | 6265557265 | 78934 Hillside Dr. | NaN | Pasadena | CA | 90003 | USA | NaN | Young | Julie | Medium |
| 4 | 10159 | 49 | 100.00 | 14 | 5205.27 | 10/10/2003 0:00 | Shipped | 4 | 10 | 2003 | Motorcycles | 95 | S10_1678 | Corporate Gift Ideas Co. | 6505551386 | 7734 Strong St. | NaN | San Francisco | CA | NaN | USA | NaN | Brown | Julie | Medium |
| 5 | 10168 | 36 | 96.66 | 1 | 3479.76 | 10/28/2003 0:00 | Shipped | 4 | 10 | 2003 | Motorcycles | 95 | S10_1678 | Technics Stores Inc. | 6505556809 | 9408 Furth Circle | NaN | Burlingame | CA | 94217 | USA | NaN | Hirano | Juri | Medium |
| 6 | 10180 | 29 | 86.13 | 9 | 2497.77 | 11/11/2003 0:00 | Shipped | 4 | 11 | 2003 | Motorcycles | 95 | S10_1678 | Daedalus Designs Imports | 20.16.1555 | 184, chausse de Tournai | NaN | Lille | NaN | 59000 | France | EMEA | Rance | Martine | Small |
| 7 | 10188 | 48 | 100.00 | 1 | 5512.32 | 11/18/2003 0:00 | Shipped | 4 | 11 | 2003 | Motorcycles | 95 | S10_1678 | Herkku Gifts | +47 2267 3215 | Drammen 121, PR 744 Sentrum | NaN | Bergen | NaN | N 5804 | Norway | EMEA | Oeztan | Veysel | Medium |
| 8 | 10201 | 22 | 98.57 | 2 | 2168.54 | 12/1/2003 0:00 | Shipped | 4 | 12 | 2003 | Motorcycles | 95 | S10_1678 | Mini Wheels Co. | 6505555787 | 5557 North Pendale Street | NaN | San Francisco | CA | NaN | USA | NaN | Murphy | Julie | Small |
| 9 | 10211 | 41 | 100.00 | 14 | 4708.44 | 1/15/2004 0:00 | Shipped | 1 | 1 | 2004 | Motorcycles | 95 | S10_1678 | Auto Canal Petit | (1) 47.55.6555 | 25, rue Lauriston | NaN | Paris | NaN | 75016 | France | EMEA | Perrier | Dominique | Medium |
Last rows
| | ORDERNUMBER | QUANTITYORDERED | PRICEEACH | ORDERLINENUMBER | SALES | ORDERDATE | STATUS | QTR_ID | MONTH_ID | YEAR_ID | PRODUCTLINE | MSRP | PRODUCTCODE | CUSTOMERNAME | PHONE | ADDRESSLINE1 | ADDRESSLINE2 | CITY | STATE | POSTALCODE | COUNTRY | TERRITORY | CONTACTLASTNAME | CONTACTFIRSTNAME | DEALSIZE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2813 | 10293 | 32 | 60.06 | 1 | 1921.92 | 9/9/2004 0:00 | Shipped | 3 | 9 | 2004 | Ships | 54 | S72_3212 | Amica Models & Co. | 011-4988555 | Via Monte Bianco 34 | NaN | Torino | NaN | 10100 | Italy | EMEA | Accorti | Paolo | Small |
| 2814 | 10306 | 35 | 59.51 | 6 | 2082.85 | 10/14/2004 0:00 | Shipped | 4 | 10 | 2004 | Ships | 54 | S72_3212 | AV Stores, Co. | (171) 555-1555 | Fauntleroy Circus | NaN | Manchester | NaN | EC2 5NT | UK | EMEA | Ashworth | Victoria | Small |
| 2815 | 10315 | 40 | 55.69 | 5 | 2227.60 | 10/29/2004 0:00 | Shipped | 4 | 10 | 2004 | Ships | 54 | S72_3212 | La Rochelle Gifts | 40.67.8555 | 67, rue des Cinquante Otages | NaN | Nantes | NaN | 44000 | France | EMEA | Labrune | Janine | Small |
| 2816 | 10327 | 37 | 86.74 | 4 | 3209.38 | 11/10/2004 0:00 | Resolved | 4 | 11 | 2004 | Ships | 54 | S72_3212 | Danish Wholesale Imports | 31 12 3555 | Vinb'ltet 34 | NaN | Kobenhavn | NaN | 1734 | Denmark | EMEA | Petersen | Jytte | Medium |
| 2817 | 10337 | 42 | 97.16 | 5 | 4080.72 | 11/21/2004 0:00 | Shipped | 4 | 11 | 2004 | Ships | 54 | S72_3212 | Classic Legends Inc. | 2125558493 | 5905 Pompton St. | Suite 750 | NYC | NY | 10022 | USA | NaN | Hernandez | Maria | Medium |
| 2818 | 10350 | 20 | 100.00 | 15 | 2244.40 | 12/2/2004 0:00 | Shipped | 4 | 12 | 2004 | Ships | 54 | S72_3212 | Euro Shopping Channel | (91) 555 94 44 | C/ Moralzarzal, 86 | NaN | Madrid | NaN | 28034 | Spain | EMEA | Freyre | Diego | Small |
| 2819 | 10373 | 29 | 100.00 | 1 | 3978.51 | 1/31/2005 0:00 | Shipped | 1 | 1 | 2005 | Ships | 54 | S72_3212 | Oulu Toy Supplies, Inc. | 981-443655 | Torikatu 38 | NaN | Oulu | NaN | 90110 | Finland | EMEA | Koskitalo | Pirkko | Medium |
| 2820 | 10386 | 43 | 100.00 | 4 | 5417.57 | 3/1/2005 0:00 | Resolved | 1 | 3 | 2005 | Ships | 54 | S72_3212 | Euro Shopping Channel | (91) 555 94 44 | C/ Moralzarzal, 86 | NaN | Madrid | NaN | 28034 | Spain | EMEA | Freyre | Diego | Medium |
| 2821 | 10397 | 34 | 62.24 | 1 | 2116.16 | 3/28/2005 0:00 | Shipped | 1 | 3 | 2005 | Ships | 54 | S72_3212 | Alpha Cognac | 61.77.6555 | 1 rue Alsace-Lorraine | NaN | Toulouse | NaN | 31000 | France | EMEA | Roulet | Annette | Small |
| 2822 | 10414 | 47 | 65.52 | 9 | 3079.44 | 5/6/2005 0:00 | On Hold | 2 | 5 | 2005 | Ships | 54 | S72_3212 | Gifts4AllAges.com | 6175559555 | 8616 Spinnaker Dr. | NaN | Boston | MA | 51003 | USA | NaN | Yoshido | Juri | Medium |
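For the bias-corrected Cramér's V mentioned in the correlation notes, the following is a hedged sketch of Bergsma's (2013) correction applied to a contingency table. It is not taken from the report generator's source; the function name `cramers_v_corrected` and the toy tables are assumptions made for illustration.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected value of phi2 under independence, clipping at zero.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    # Shrink the table dimensions by the same small-sample correction.
    k_corr = k - (k - 1) ** 2 / (n - 1)
    r_corr = r - (r - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0

perfect = [[10, 0], [0, 10]]      # each row category maps to exactly one column category
independent = [[5, 5], [5, 5]]    # row category carries no information about the column

print(cramers_v_corrected(perfect), cramers_v_corrected(independent))
```

The two toy tables hit the ends of the scale: perfect association yields a value of 1 (up to floating-point rounding) and independence yields 0.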